Scalable and Fault Tolerant Platform for Distributed Learning on Private Medical Data

Authors

  • Alborz Amir-Khalili
  • Soheil Kianzad
  • Rafeef Abugharbieh
  • Ivan Beschastnikh
Abstract

Medical image data is naturally distributed among clinical institutions. This partitioning, combined with security and privacy restrictions on medical data, imposes limitations on machine learning algorithms in clinical applications, especially for small and newly established institutions. We present InsuLearn: an intuitive and robust open-source platform designed to facilitate distributed learning (classification and regression) on medical image data, while preserving data security and privacy. InsuLearn is built on ensemble learning, in which statistical models are developed at each institution independently and combined at secure coordinator nodes. InsuLearn protocols are designed such that the liveness of the system is guaranteed as institutions join and leave the network. Coordination is implemented as a cluster of replicated state machines, making it tolerant to individual node failures. We demonstrate that InsuLearn successfully integrates accurate models for horizontally partitioned data while preserving privacy.
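The ensemble idea in the abstract (models built independently at each institution, then combined at a coordinator) can be illustrated with a minimal Python sketch. This is not InsuLearn's actual protocol or API, which the abstract does not specify. The nearest-centroid learner, the majority-vote combination rule, and the names `train_local_model`, `predict_local`, and `Coordinator` are all assumptions chosen for illustration, and the replicated state machine layer is left out entirely.

```python
from collections import Counter
from statistics import mean


def train_local_model(samples):
    """Fit a nearest-centroid classifier on one institution's private data.

    `samples` is a list of (feature_vector, label) pairs. Only the returned
    centroids (the model) would leave the institution, never the samples.
    """
    by_label = {}
    for features, label in samples:
        by_label.setdefault(label, []).append(features)
    return {
        label: [mean(column) for column in zip(*rows)]
        for label, rows in by_label.items()
    }


def predict_local(model, features):
    """Return the label whose centroid is closest to `features`."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, features))
    return min(model, key=lambda label: squared_distance(model[label]))


class Coordinator:
    """Hypothetical coordinator holding one model per participating site."""

    def __init__(self):
        self.models = {}

    def join(self, site, model):
        # A site registers (or replaces) its locally trained model.
        self.models[site] = model

    def leave(self, site):
        # A departing site's model is dropped; the rest keep serving.
        self.models.pop(site, None)

    def predict(self, features):
        """Majority vote over the models currently registered."""
        if not self.models:
            raise RuntimeError("no models registered")
        votes = Counter(
            predict_local(model, features) for model in self.models.values()
        )
        return votes.most_common(1)[0][0]
```

In this sketch each site would call `train_local_model` on its own data and pass only the result to `Coordinator.join`. Predictions remain available while at least one site is registered, which loosely mirrors the join and leave behaviour described above.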

Similar resources

SCOPE: Scalable Composite Optimization for Learning on Spark

Many machine learning models, such as logistic regression (LR) and support vector machine (SVM), can be formulated as composite optimization problems. Recently, many distributed stochastic optimization (DSO) methods have been proposed to solve the large-scale composite optimization problems, which have shown better performance than traditional batch methods. However, most of these DSO methods m...

CumuloNimbo: A Cloud Scalable Multi-tier SQL Database

This article presents an overview of the CumuloNimbo platform. CumuloNimbo is a framework for multi-tier applications that provides scalable and fault-tolerant processing of OLTP workloads. The main novelty of CumuloNimbo is that it provides a standard SQL interface and full transactional support without resorting to sharding and no need to know the workload in advance. Scalability is achieved ...

Towards High-performance and Fault-tolerant Distributed Java Implementations

Java Virtual Machines form an important part of the web and business server market. Distributed Java Virtual Machines have the potential to make a significant contribution to industries that utilize this technology. An attractive platform for this purpose is the cluster, a highly cost-effective and scalable parallel computer model. However, realizing on such a platform a high performance virtua...

Fault-tolerant control for Scalable Distributed Data Structures

Scalable Distributed Data Structures (SDDS) can be applied for multicomputers. Multicomputers were developed as a response to market demand for scalable and dependable but not expensive systems. SDDS consists of two components dynamically spread across a multicomputer: records belonging to a file and a mechanism controlling record placement in the file. Methods of making records of the file mor...

A Distributed Recommendation Platform for Big Data

The vast amount of information that recommenders manage these days has reached a point where scalability has become a critical factor. In this work, we propose a scalable architecture designed for computing Collaborative Filtering recommendations in a Big Data scenario. In order to build a highly scalable and fault-tolerant platform, we employ fully distributed systems without any single point ...

Publication date: 2017